Generating a combined image from multiple images

ABSTRACT

A determination is made for each of multiple regions in multiple images of how good that region is perceived as being. A base image is identified, and a combined image is generated from the multiple images by automatically replacing each region of the base image with a corresponding region of another image if the corresponding region has been determined as being better than the region of the base image. The generating of the combined image can include automatically selecting from one of the multiple images a region in which an object that is present in one or more corresponding regions of other images is absent. Additionally, for a particular region of the base image, corresponding regions of the other images can be displayed, and the particular region replaced with a user-selected one of the corresponding regions of the other images.

BACKGROUND

Users frequently take pictures of groups of objects, such as groups of people (e.g., family members or friends), groups of animals (e.g., pets), and so forth. Unfortunately, it is oftentimes difficult to get all of the objects in an acceptable position or pose at the time the picture is taken. For example, for a group of people, when the picture is taken one or more people may be blinking, frowning, looking away from the camera, and so forth. It can also be difficult to get all of the objects in the picture without other extraneous objects being present, such as additional people walking through the picture. These difficulties can lead to situations where users do not get the pictures they desire, and can result in user frustration when trying to take pictures.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

In accordance with one or more aspects, multiple images that each include multiple objects are accessed. A determination is made, for each of multiple regions in the multiple images, of how good that region is perceived as being. A combined image is generated from the multiple images based on the determinations of how good corresponding ones of the multiple regions in the multiple images are perceived as being. The generating of the combined image includes automatically selecting, for inclusion in the combined image, a region from one of the multiple images in which an object that is present in one or more corresponding regions of other images of the multiple images is absent.

In accordance with one or more aspects, multiple images that each include multiple objects are accessed. A determination is made, for each of multiple regions in the multiple images, of how good that region is perceived as being. A base image of the multiple images is identified, which can be the image having the most regions determined as being perceived as the “best” regions. A combined image is generated from the multiple images by automatically replacing each region of the base image with a corresponding region of another image of the multiple images if the corresponding region has been determined as being better than the region of the base image. Additionally, regions of each of the other images of the multiple images are displayed, with each of the regions corresponding to a particular region of the base image. A user selection of one of the corresponding regions of the other images of the multiple images is received, and the particular region of the base image is replaced with the user-selected one of the corresponding regions of the other images.

BRIEF DESCRIPTION OF THE DRAWINGS

The same numbers are used throughout the drawings to reference like features.

FIG. 1 illustrates an example system implementing the generating a combined image from multiple images in accordance with one or more embodiments.

FIG. 2 shows an example of multiple images of the same scene in accordance with one or more embodiments.

FIG. 3 shows an example of automatically replacing regions in one image with corresponding regions from other images to generate a combined image in accordance with one or more embodiments.

FIG. 4 shows an example of a combined image in accordance with one or more embodiments.

FIG. 5 shows an example of a user interface via which a user can provide input regarding which of multiple corresponding regions is to be selected in accordance with one or more embodiments.

FIG. 6 is a flowchart illustrating an example process for generating a combined image from multiple images in accordance with one or more embodiments.

FIG. 7 is a flowchart illustrating an example process for selecting regions to include in the combined image based on user input in accordance with one or more embodiments.

FIG. 8 illustrates an example computing device that can be configured to implement the generating a combined image from multiple images in accordance with one or more embodiments.

DETAILED DESCRIPTION

Generating a combined image from multiple images is discussed herein. Multiple images of the same scene are captured, and although the images are of the same scene there can be differences between the images. Different regions within those images are identified and determinations are made for each of the different regions of how good the region is perceived as being (e.g., if the region represents a face, whether people are smiling, whether people have their eyes open, etc.). The determinations of how good the region is perceived as being can be made by an evaluation module based on various characteristics of the region. A base image of the multiple images is selected, typically by selecting the image having the most regions that will not be replaced by a corresponding region from the other of the multiple images. When a region of another of the multiple images is determined to be perceived as better than the corresponding region of the base image, that region in the base image can be automatically replaced by the corresponding region from the other of the multiple images.

A user interface can also be displayed to the user for a particular region in the base image, displaying to the user the corresponding regions of the other of the multiple images and allowing the user to select one of those corresponding regions that is to replace the region in the base image. Additionally, situations where an object is present in a region of the base image, but is not present in corresponding regions of one or more other of the multiple images (such as when a person is walking through the background of a scene), can be identified. In such situations, the region without the object present can optionally be automatically selected as the region to replace the corresponding region in the base image.

FIG. 1 illustrates an example system 100 implementing the generating a combined image from multiple images in accordance with one or more embodiments. System 100 can be implemented as part of one or more of a variety of different types of devices, such as a desktop computer, a mobile station, a kiosk, an entertainment appliance, a set-top box communicatively coupled to a display device, a television, a cellular or other wireless phone, a camera, a camcorder, an audio/video playback device, a game console, an automotive computer, and so forth. Alternatively, system 100 can be implemented across multiple devices of the same or different types. Such multiple devices can be coupled to one another in a variety of different manners, such as via a wired or wireless connection (e.g., a Universal Serial Bus (USB) connection, a wireless USB connection, a connection in accordance with the IEEE 1394 standard, etc.), or via a network (e.g., the Internet, a local area network (LAN), a cellular or other phone network, etc.), and so forth.

System 100 includes an image generation module 102, an object database 104, and a user interface module 106. Module 102, database 104, and module 106 can be implemented as part of the same, or alternatively different, devices. Image generation module 102 receives multiple images 110 and generates a combined image 112 by selecting different regions from different ones of the multiple images 110. Object database 104 is a record of objects that are recognized by or otherwise known to system 100. Object database 104 can be, for example, a record of different faces and associated names that are known to system 100. Object database 104 can include multiple images, with objects being identified in each of these multiple images in database 104. For example, object database 104 can be a digital photo album (e.g., maintained by an online service) including multiple different images in which people have been identified (with the same and/or different people being identified in different ones of these images). This record of objects can be maintained using a variety of different data structures or storage techniques. User interface (UI) module 106 manages presentation of information to a user of system 100 and receipt of requests from a user of system 100.

Images 110 can be obtained by image generation module 102 in a variety of different manners. Images 110 can be captured by a device including module 102, can be provided to module 102, can be stored in a location identified to module 102 from which module 102 retrieves images 110, and so forth. For example, image generation module 102 can be implemented as part of an Internet service to which a user uploads or otherwise transfers images 110. By way of another example, image generation module 102 can be implemented as part of a digital camera that captures images 110. By way of yet another example, image generation module 102 can be implemented as an in-store kiosk that retrieves images from a memory device coupled to the kiosk.

Image generation module 102 includes an object detection module 122, an evaluation module 124, an image combining module 126, and an image registration module 128. The operation of modules 122, 124, 126, and 128 is discussed generally here, and in more detail below. Generally, object detection module 122 detects regions within the images 110. Evaluation module 124 makes a determination for each of these regions of how good the region is perceived as being. These determinations can be made in a variety of different manners as discussed in more detail below. Based on these determinations, image combining module 126 selects one of the multiple images 110 to be a base image and then automatically selects different regions of other images of the multiple images 110 to replace corresponding regions of the base image in order to generate combined image 112. For a given region, typically the corresponding region determined as being perceived as the “best” region is the region that is included in combined image 112. Image registration module 128 is optionally included in image generation module 102, and when included determines how images map to one another. This mapping refers to which portions of images are regions that correspond to one another.

Regions typically, but not always, include objects. Image combining module 126 can automatically select a corresponding region of another image that includes no object, even though the base image may include the object. Thus, an object in the base image can be deleted or removed from a scene and not included in combined image 112. Additionally, user interface module 106 can allow a user to override the automatic selections made by image combining module 126 as discussed in more detail below.

Object detection module 122 can be configured to detect a variety of different types of objects within regions of images 110. These types of objects can be, for example, people (or faces of people) and/or animals (e.g., pets). Alternatively, other objects can be detected, such as buildings, landscaping or other geographic features, cars or other vehicles, items or human organs (e.g., on x-ray images), and so forth. Object detection module 122 is typically configured to detect one type of object, although alternatively can be configured to detect any number of different types of objects.

Multiple images 110 received by image generation module 102 are typically of the same scene, such as multiple images of a group of people at a wedding or family reunion. In one or more embodiments, object detection module 122 can detect whether one or more of the multiple images 110 are not of the same scene. Images that are detected as not being of the same scene are automatically removed and not considered by image generation module 102 as included in the multiple images 110. Whether images are of the same scene can be determined in a variety of different manners. For example, two images having at least a threshold number of the same objects can be determined as being of the same scene. This threshold number can be a fixed number (e.g., 5 or more of the same objects) or a relative number (e.g., 60% or more of the objects in the images are in both images). By way of another example, a user of system 100 can provide input indicating which images are of the same scene. Thus, even though two images may be of the same scene, the two images need not be (and typically are not) identical.

In one or more embodiments, image registration module 128 determines which images are of the same scene. Image registration module 128 uses a registration technique to determine how images map to each other spatially. If two images map to one another well enough (e.g., at least a threshold number of matching features are included in each of the images), then the two images are determined to be of the same scene. Matching features can be identified using a variety of different conventional techniques, such as using a scale-invariant feature transform (SIFT) algorithm.
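
As a concrete illustration of this same-scene check, the sketch below counts SIFT feature matches between two images using OpenCV (cv2.SIFT_create is available in recent OpenCV builds). The Lowe ratio test and the threshold of 50 matches are illustrative assumptions, not values taken from the description above.

```python
import cv2

def same_scene(img_a, img_b, min_matches=50):
    # Detect SIFT features and descriptors in grayscale versions of each image.
    sift = cv2.SIFT_create()
    gray_a = cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY)
    _, desc_a = sift.detectAndCompute(gray_a, None)
    _, desc_b = sift.detectAndCompute(gray_b, None)
    if desc_a is None or desc_b is None:
        return False
    # Keep only distinctive matches (Lowe's ratio test).
    matches = cv2.BFMatcher().knnMatch(desc_a, desc_b, k=2)
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    # Two images are treated as the same scene if enough features match.
    return len(good) >= min_matches
```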

Even though images 110 are of the same scene, the objects within the scene can be different. For example, an unknown person can be walking behind a group of people and thus appear in different locations in different images 110. By way of another example, one person in the group of people may move and thus be in different locations in different images 110. By way of yet another example, people may move their heads, talk, blink, etc., and thus be in different locations or poses in different images 110.

Object detection module 122 also aligns the multiple images 110. Aligning the multiple images 110 refers to identifying different regions of the images 110 that correspond to one another (e.g., include the same object). As part of this alignment process, for each of the multiple images 110 object detection module 122 identifies objects within the image, identifies a region of the image that includes that object, and also identifies, for a region that is identified in one image, corresponding regions in different images 110. These corresponding regions in different images 110 are typically in approximately the same location of the scene. Accordingly, when object detection module 122 identifies a region in one image, module 122 also identifies corresponding regions in the same location of the scene in the other images. These corresponding regions can, but need not, include the same objects. For example, as discussed in more detail below, one region may include an object (e.g., a person walking behind a group of people) that is absent from a corresponding region of another image.

Corresponding regions in different images 110 can be determined in different manners. For example, image registration module 128 can use a registration technique to determine how images map to each other spatially. Matching features in images 110 are identified, and the locations of those features in the images 110 are identified. Particular objects (e.g., faces) within those matching features are identified, and regions that surround those particular objects are identified.

In one or more embodiments, the identification of regions in images 110 is based at least in part on object recognition. Object database 104 is a record of objects that are recognized by or otherwise known to system 100. Object database 104 can be generated in a variety of different manners, such as based on input from a user of system 100 identifying particular objects (e.g., tagging objects in their digital photo album), information identifying particular objects obtained from other components or devices, and so forth. Object detection module 122 uses the information in object database 104 to automatically detect known objects (objects known to system 100) in images 110. The presence of these known objects in particular locations in images 110 can then be used to identify a region around the detected objects.

In alternate embodiments, object detection module 122 can operate without object database 104. In such embodiments, object detection module 122 detects particular objects within images 110, and also detects when an object in one image 110 is the same as an object in another of the images 110. Although object detection module 122 may not be identifying known objects in such embodiments, object detection module 122 can still detect when objects in multiple images are the same.

The detection of objects in images can be performed in a variety of different conventional manners. It is to be appreciated that the manner in which objects are detected in images can vary based on the particular objects being detected. For example, different techniques can be used to detect human faces than are used to detect animal faces or other objects.

The aligning of the multiple images and the identification of regions around objects (including the identification of seams along which the region can be “cut” for removal or copying from an image, and the joining or pasting of one region into another image) can be performed in a variety of different conventional manners. In one or more embodiments, the aligning of the multiple images and the identification of regions around objects are performed using a photomontage technique for splicing regions of images together discussed in more detail in A. Agarwala, et al., “Interactive Digital Photomontage”, ACM SIGGRAPH 2004. In one or more embodiments, the splicing of a region from one image into another is performed using automatic selection and blending techniques. An example of an automatic selection technique is discussed in more detail in C. Rother, et al., “GrabCut: Interactive Foreground Extraction Using Iterated Graph Cuts”, ACM SIGGRAPH 2004, and an example of a blending technique is discussed in more detail in A. Criminisi, et al., “Region Filling and Object Removal by Exemplar-Based Inpainting”, IEEE Transactions on Image Processing, vol. 13, no. 9, pp. 1200-1212, September 2004.
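
The referenced GrabCut technique has a readily available implementation in OpenCV (cv2.grabCut); the sketch below shows how a region around an object might be cut out of an image given a bounding rectangle. This is a minimal sketch of the automatic-selection step only, under the assumption that a rectangle around the object is already known; it is not the implementation described in the cited papers.

```python
import cv2
import numpy as np

def cut_region(image, rect):
    # rect is (x, y, width, height) around the object to extract.
    mask = np.zeros(image.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)  # internal GrabCut state
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image, mask, rect, bgd_model, fgd_model, 5,
                cv2.GC_INIT_WITH_RECT)
    # Keep pixels labeled definite or probable foreground.
    fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(image.dtype)
    return image * fg[:, :, np.newaxis]
```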

FIG. 2 shows an example of multiple images of the same scene in accordance with one or more embodiments. FIG. 2 illustrates three images 202, 204, and 206 of the same scene that are, for example, multiple images 110 of FIG. 1. Although only three images are illustrated in the example of FIG. 2, it is to be appreciated that any number of images can be used with the techniques discussed herein.

Each of the images 202, 204, and 206 includes multiple regions illustrated as ovals, although it is to be appreciated that a region can be any shape. Each of these regions is illustrated as being the same size, although it is to be appreciated that regions can be different sizes. Each of these regions can include an object as discussed above. Each image 202, 204, and 206 is illustrated as including five regions, although it is to be appreciated that any number of regions can be included in an image.

Image 202 includes regions 210, 212, 214, 216, and 218. Image 204 includes regions 220, 222, 224, 226, and 228. Image 206 includes regions 230, 232, 234, 236, and 238. Different regions in different images that are at approximately the same location are corresponding regions. For example, regions 210, 220, and 230 are corresponding regions. By way of another example, regions 214 and 224 are corresponding regions.

Returning to FIG. 1, evaluation module 124 analyzes the images 110, making a determination of how good each region of the images 110 is perceived, by evaluation module 124, as being. Based on these determinations, one of multiple corresponding regions can readily be determined as the “best” region of the multiple corresponding regions. Evaluation module 124 can make a determination of how good a region of an image is using a variety of different rules or criteria, and can generate a value reflecting this determination. The value generated by evaluation module 124 can be, for example, a score for a region that indicates how good module 124 perceives the region as being (to be compared to other regions), a ranking for a region that indicates how good module 124 perceives the region as being compared to other regions, and so forth. Of multiple corresponding regions, the region having the “best” (e.g., highest) value can be selected as the “best” region of the multiple corresponding regions.

In embodiments where evaluation module 124 generates a score for a region that indicates how good the region is perceived as being, typically regions with higher scores (e.g., larger numerical values) are perceived as being better than regions with lower scores (e.g., smaller numerical values). The score can be determined in a variety of different manners. In one or more embodiments the score is determined by evaluating one or more of various characteristics of the region. Evaluation module 124 is configured with or otherwise has access to weights associated with the various characteristics that affect the scores of the regions, with some characteristics being associated with higher weights than other characteristics. Different characteristics in a region can increase the score of the region or decrease the score of the region (e.g., depending on the weights of the particular characteristic). In other embodiments, the score is determined based on a learning process in which a component or module (such as evaluation module 124) automatically learns which attributes of a region are to be given higher scores. For example, a neural net, decision tree, or other learning machine can be used to learn, based on user feedback of regions that the user identifies as good or bad, the characteristics of the regions that the user identifies as good and the characteristics of the regions that the user identifies as bad. This neural net, decision tree, or other learning machine can then be used to assign scores to the different regions in the images.
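
As a toy illustration of the learning-based approach, the sketch below trains a decision tree on labeled user feedback using scikit-learn. The three-feature encoding (eyes open, smiling, blurry) and the training rows are invented for the example; a real system would use whatever characteristics its detectors produce.

```python
from sklearn.tree import DecisionTreeClassifier

# Each row encodes region characteristics: [eyes_open, smiling, blurry].
# Labels are user feedback: 1 = region judged good, 0 = judged bad.
features = [
    [1, 1, 0],  # open eyes, smiling, sharp  -> good
    [1, 0, 0],  # open eyes, no smile, sharp -> good
    [0, 1, 0],  # closed eyes                -> bad
    [1, 1, 1],  # blurry                     -> bad
]
labels = [1, 1, 0, 0]

model = DecisionTreeClassifier().fit(features, labels)

# The predicted probability of "good" can then serve as a region score.
score = model.predict_proba([[1, 1, 0]])[0][1]
```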

Alternatively, a determination of how good a region is perceived as being by evaluation module 124 can be made by comparing multiple corresponding regions, such as by using a neural net, decision tree, or other learning machine. This comparison can be based on evaluating one or more of a variety of different characteristics of a region. Based on this comparison, one of the multiple corresponding regions is selected as being perceived as the “best” region of the multiple corresponding regions. The one of the multiple corresponding regions that is perceived as the “best” of the multiple corresponding regions can optionally be determined automatically using a neural net, decision tree, or other learning machine. A ranking can be assigned to these regions (e.g., ranking the regions in order from the region perceived as “best” to the region perceived as “worst”). Alternatively, a score can be assigned to these regions (e.g., values of “best” or “not best”), or one of the regions can be flagged or otherwise identified as being perceived as the “best” region of the multiple regions.

In embodiments in which the determination of how good a region is perceived as being is made by evaluating one or more characteristics of a region, these characteristics can include characteristics of an object within the region and/or other characteristics of the region. Following is a list of several different characteristics that can be used by evaluation module 124 in determining how good a region is perceived as being. These characteristics are: object is tagged with commonly used tag, object is tagged, object rectangle or region has been drawn or confirmed by user, object recognition has a high confidence suggestion, object detector has found an object present, eye data is perceived as good, smile data is perceived as good, image is underexposed, image is overexposed, and object is blurry. It is to be appreciated that these characteristics are examples, and that other characteristics can alternatively be used.

Object is Tagged with Commonly Used Tag.

The region includes an object that has been identified as a known object (based on object database 104) and is a commonly tagged object. A tagged object is an object the identity of which has been identified by a user of system 100. The identity of the object can be maintained as part of the image that includes the region (e.g., in metadata associated with the image) or alternatively separately (e.g., in a separate record or database). A commonly tagged object is an object the identity of which has been frequently identified by a user of system 100 in the same or different images. This frequency can be determined based on a fixed value (e.g., the object has been identified five times in five different images by a user, or the object is one of the top five most frequently identified objects) or on a relative value (e.g., the object has been identified by a user more often than 90% of the other objects in object database 104). For example, if object database 104 includes multiple images of people, a user of system 100 can tag people in those images by identifying (e.g., by name) those people. People in the images of object database 104 that are tagged more frequently than other people in the images are commonly tagged objects.
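
A minimal sketch of both the fixed and relative frequency rules follows; the function name, the dict-based tag store, and the parameter values (five tags, 90th percentile) mirror the examples in the text but are otherwise assumptions.

```python
def is_commonly_tagged(object_id, tag_counts, fixed_minimum=5, percentile=0.90):
    # tag_counts maps an object id to how often a user has tagged it.
    count = tag_counts.get(object_id, 0)
    if count >= fixed_minimum:        # fixed rule: tagged at least N times
        return True
    others = [c for oid, c in tag_counts.items() if oid != object_id]
    if not others:
        return False
    beaten = sum(1 for c in others if count > c)
    return beaten / len(others) >= percentile  # relative rule: beats 90% of others
```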

Object is Tagged.

The region includes an object that is a tagged object. A tagged object is an object the identity of which has been identified by a user of system 100. A tagged object is similar to a commonly tagged object, except that the object has not been frequently identified by a user of system 100.

Object Rectangle or Region has been Drawn or Confirmed by User.

The region includes a rectangle or other geometric shape around an object. A rectangle or other shape can be drawn around an object by a user of system 100. Such a rectangle or other shape can be drawn in different manners, such as system 100 displaying an image 110 that includes the object and receiving via a user interface an indication of the rectangle or other shape (e.g., via a pointer, via a finger or stylus on a touchscreen, and so forth). Alternatively, a rectangle or other shape can be automatically drawn around an object by another component or module, and the location of that rectangle or other shape confirmed by a user of system 100. The rectangle or other shape drawn around an object indicates that an object is present within that rectangle or other shape, although the identity of the object has not yet been identified by a user of system 100.

Object Recognition has a High Confidence Suggestion.

The region includes an object that has been automatically identified with a high probability of accuracy. Such an object is identified by a particular component or module rather than by a user of system 100. The object can be identified by object detection module 122 or alternatively another component or module. The high probability of accuracy can be identified in different manners, such as based on a fixed value (e.g., at least a 95% probability of accuracy) or a relative value (e.g., a higher probability than 80% of the other objects detected by the component or module).

Object Detector has Found an Object Present.

The region includes an object that has been automatically identified by a particular component or module rather than by a user of system 100. The object may be identifiable by object detection module 122 or alternatively another component or module.

Eye Data is Perceived as Good.

In embodiments where the object includes a face, a value representing how good the eyes in each face are perceived as being can be generated. This value can reflect, for example, whether eyes are detected as being present in each face (e.g., as opposed to being obscured from view due to a head being turned or a hand covering the eyes), whether eyes are detected as being open (e.g., as opposed to closed due to blinking), whether there is a well-defined catchlight in the eye, and so forth. A variety of different conventional techniques can be employed to detect eyes within a face, determine whether eyes are open, identify catchlight in eyes, and so forth. The value can be generated, for example, by assigning a larger number if open eyes with catchlight are detected in a face, a smaller value if open eyes without catchlight are detected in a face, an even smaller value if closed eyes are detected in a face, and so forth. A larger number can alternatively be assigned if a catchlight that enhances the image is detected in a face (e.g., based on whether the orientation of the catchlight in the eyes matches the orientation of the catchlights in the eyes of other faces (in other regions) in the image), and a smaller number assigned if a catchlight that does not enhance the image is detected in a face. Alternatively, a ranking or value indicating how good the eyes in a face are perceived as being can be determined by a learning process (such as a neural net, decision tree, or other learning machine) that automatically learns what attributes of a face indicate how good eyes are (e.g., based on user feedback of what is good).
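
One simple way to turn such detector outputs into a value is sketched below; the inputs (whether eyes are present, open, and have an enhancing catchlight) and the specific numbers are assumptions chosen only to reproduce the ordering described above.

```python
def eye_score(eyes_present, eyes_open, catchlight_enhances):
    # Ordering follows the text: open eyes with catchlight score highest,
    # open eyes without catchlight lower, closed eyes lower still,
    # and obscured eyes lowest.
    if not eyes_present:
        return 0.0
    if not eyes_open:
        return 0.25
    return 1.0 if catchlight_enhances else 0.75
```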

Smile Data is Perceived as Good.

In embodiments where the object includes a face, a value representing how good the smile in each face is perceived as being can be generated. This value can be generated to indicate, for example, whether a mouth is detected as being present in each face (e.g., as opposed to being obscured from view due to a head being turned or a hand covering the mouth), whether a smile is detected as being present (e.g., as opposed to a frown being present or a tongue sticking out), and so forth. A variety of different conventional techniques can be employed to detect whether a mouth is present in a face, whether a smile is present within a face, and so forth. The value can be generated, for example, by assigning a larger number if a smile is detected in a face, a smaller value if a closed mouth is detected in a face, an even smaller value if no mouth is detected in a face, and so forth. Alternatively, a ranking or value indicating how good the smile in a face is perceived as being can be determined by a learning process (such as a neural net, decision tree, or other learning machine) that automatically learns what attributes of a face indicate how good a smile is (e.g., based on user feedback of what is good).

Image is Underexposed.

The image is determined to be underexposed. This determination can be made based on the entire image, based on all regions in the image, or on a region by region basis. Whether the image is underexposed can be determined in different manners, such as based on exposure values derived from a histogram of the image or one or more regions of the image. Whether the image is underexposed can also be determined based at least in part on exposure values determined for other of the multiple images 110. For example, an image having an exposure value that is at least a threshold amount less than the exposure values for the other multiple images can be determined to be underexposed. This threshold amount can be a fixed amount (e.g., a particular part of a histogram of the image is less than the same part of the histogram for other images) or a relative amount (e.g., a particular part of a histogram of the image is at least 10% less than the same part of the histogram for other images).

Image is Overexposed.

The image is determined to be overexposed. This determination can be made based on the entire image, based on all regions in the image, or on a region by region basis. Whether the image is overexposed can be determined in different manners, such as based on exposure values derived from a histogram of the image or one or more regions of the image. Whether the image is overexposed can also be determined based at least in part on exposure values determined for other of the multiple images 110. For example, an image having an exposure value that is at least a threshold amount greater than the exposure values for the other multiple images can be determined to be overexposed. This threshold amount can be a fixed amount (e.g., a particular part of a histogram of the image is greater than the same part of the histogram for other images) or a relative amount (e.g., a particular part of a histogram of the image is at least 10% greater than the same part of the histogram for other images).
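
Both exposure checks can be sketched with a simple exposure value compared across images; using mean luminance as that value and comparing against the average of the other images are assumptions, since the text leaves the exact histogram-derived measure open. The sketch assumes at least two grayscale images.

```python
import numpy as np

def exposure_value(gray_image):
    # One possible histogram-derived exposure value: mean luminance.
    return float(np.mean(gray_image))

def exposure_flags(gray_images, relative_threshold=0.10):
    # Returns (underexposed, overexposed) per image, using the 10%
    # relative rule from the examples above.
    values = [exposure_value(img) for img in gray_images]
    flags = []
    for i, v in enumerate(values):
        others = [x for j, x in enumerate(values) if j != i]
        mean_others = sum(others) / len(others)
        flags.append((v < mean_others * (1 - relative_threshold),
                      v > mean_others * (1 + relative_threshold)))
    return flags
```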

Object is Blurry.

An object in the region is detected as being blurry. Whether the object is blurry, a degree of blurriness for the object, or a type of blurriness (e.g., depth of field blur, motion blur, camera shake blur, and so forth) can be identified in a variety of different conventional manners.

Each characteristic used by evaluation module 124 (such as those discussed above) has an associated weight, and different characteristics can have different associated weights. For example, the characteristics regarding whether the image is overexposed, whether the image is underexposed, and whether the object is blurry can have lower associated weights than the other characteristics. The weight for a characteristic can be, for example, a particular value (such as a numerical value) or a set of values (e.g., a set of multiple numerical values).

In one or more embodiments, one or more of these weights are used to generate a score for the region. The score is used to identify which region is perceived as being the “best” (e.g., the region having the highest score is perceived as being the “best”). The score for a region can be generated in a variety of different manners. In one or more embodiments, evaluation module 124 generates a characteristic score or value for each characteristic of a region evaluated by module 124 (e.g., a characteristic score indicating whether a region includes an object that has been identified as a known object and is a commonly tagged object, a characteristic score that is a value representing how good the eyes in each face are perceived as being, etc.). These characteristic scores are normalized so that the characteristic scores for the various characteristics evaluated by evaluation module 124 have the same range. For each characteristic evaluated by evaluation module 124, module 124 determines the product of the characteristic score and the weight, and adds these products for the various characteristics that were evaluated together to obtain a score for the region. In other embodiments, the characteristic scores for the various characteristics evaluated are combined (e.g., added together, averaged together, etc.) without being normalized and/or multiplied by a weight to determine a score for the region. In other embodiments, one of these characteristic scores (e.g., the characteristic score that is the largest value) can be selected as the score for the region. In other embodiments, the characteristics can be analyzed in a prioritized order (e.g., by a neural net, decision tree, or other learning machine), and a score for the region assigned based on the characteristics.
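
The product-and-sum variant can be written compactly as below; the characteristic names, the negative weights for the exposure and blur characteristics, and the assumption that characteristic scores arrive already normalized to a common 0..1 range are all illustrative.

```python
def region_score(characteristic_scores, weights):
    # Weighted sum of normalized characteristic scores.
    return sum(score * weights.get(name, 0.0)
               for name, score in characteristic_scores.items())

# Illustrative weights: exposure and blur carry less weight (here,
# negative, so they reduce the score) than the tagging characteristics.
weights = {"commonly_tagged": 5.0, "eyes_good": 3.0, "smile_good": 3.0,
           "underexposed": -1.0, "overexposed": -1.0, "blurry": -1.0}
score = region_score({"commonly_tagged": 1.0, "eyes_good": 0.75,
                      "smile_good": 1.0, "blurry": 0.2}, weights)
```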

Image combining module 126 uses the determinations of how good the regions are perceived as being to select one of the multiple images 110 as a base image. This base image serves as a starting point for the combined image 112 being generated, and can have regions replaced with corresponding regions from other images to generate the combined image 112. In one or more embodiments, an image score is calculated by combining (e.g., adding together, averaging together, etc.) the scores for the regions of the image. The base image is selected as the image having the largest image score. Alternatively, the base image can be identified in different manners, such as selecting the image having the region with the highest score as the base image, selecting the image having the largest number of regions that have been determined to be the “best” regions relative to the corresponding regions of other images, selecting the base image randomly or according to some other rules or criteria, and so forth.
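
The sum-based selection rule reduces to a few lines; the list-of-lists layout of the scores is an assumption for the sketch.

```python
def select_base_image(region_scores_per_image):
    # region_scores_per_image[i] holds the region scores of image i;
    # the base image is the one with the largest summed (image) score.
    image_scores = [sum(scores) for scores in region_scores_per_image]
    return image_scores.index(max(image_scores))

# e.g., three images with five scored regions each:
base = select_base_image([[0.6, 0.9, 0.4, 0.7, 0.5],
                          [0.8, 0.5, 0.6, 0.6, 0.4],
                          [0.7, 0.8, 0.5, 0.9, 0.9]])  # -> 2
```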

For each region in the base image, image combining module 126 determines whether to keep the region or replace the region with a corresponding region of another of the multiple images. Image combining module 126 makes this determination by automatically selecting the one of the corresponding regions having been determined to be the “best” region (as determined by evaluation module 124 as discussed above). For example, referring to FIG. 2, assume that image 204 is the base image. Image combining module 126 determines which one of corresponding regions 212, 222, and 232 is determined to be the “best” region. If region 222 is determined to be the “best” region, then image combining module 126 keeps region 222 in image 204 to generate the combined image. However, if region 212 or 232 is determined to be the “best” region, then image combining module 126 automatically replaces region 222 in image 204 with the one of regions 212 and 232 that is determined to be the “best” region.
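
The keep-or-replace decision for each set of corresponding regions can be sketched as follows; the score layout is the same assumption as above, and the actual cutting and blending of pixels is left to the splicing techniques already discussed.

```python
def choose_regions(scores_by_set, base_index):
    # scores_by_set[k][i] is the score of image i's region in set k.
    # Returns, per set, the index of the image whose region is used.
    chosen = []
    for scores in scores_by_set:
        best = max(range(len(scores)), key=lambda i: scores[i])
        # Keep the base image's region unless another region scored higher.
        chosen.append(best if scores[best] > scores[base_index] else base_index)
    return chosen
```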

It should be noted that a particular region of the base image that includes an object can be automatically replaced by image combining module 126 with a corresponding region of another image from which the object is absent. For example, referring to FIG. 2, assume that image 204 is the base image. Further assume that region 218 and region 228 both include the face of a person that is not recognized as being a known individual, and thus region 218 and region 228 are both assigned a low (possibly negative) score by evaluation module 124. Further assume that the person was walking through the scene as the images were captured, and thus the person is not included in corresponding region 238. Evaluation module 124 can assign region 238 a higher score than regions 218 and 228 because region 238 does not include a face of a person that is not recognized as being a known individual. Thus, region 238 is determined to be the “best” region of corresponding regions 218, 228, and 238. Accordingly, image combining module 126 automatically replaces region 228 in image 204 with region 238 to generate the combined image. By replacing region 228 with region 238, a region in which an object is present (region 228) is automatically replaced by a region in which the object is absent (region 238).

FIG. 3 shows an example of automatically replacing regions in one image with corresponding regions from other images to generate a combined image in accordance with one or more embodiments. FIG. 3 illustrates three images 302, 304, and 306 of the same scene that are, for example, images 202, 204, and 206, respectively, of FIG. 2. Although only three images are illustrated in the example of FIG. 3, it is to be appreciated that any number of images can be used with the techniques discussed herein.

Image 302 includes regions 310, 312, 314, 316, and 318. Image 304 includes regions 320, 322, 324, 326, and 328. Image 306 includes regions 330, 332, 334, 336, and 338. The corresponding regions that are determined to be the “best” region are illustrated in FIG. 3 with cross-hatching. Accordingly, region 320 is determined to be the “best” region of the set of corresponding regions 310, 320, and 330. Similarly, region 332 is determined to be the “best” region of the set of corresponding regions 312, 322, and 332, region 314 is determined to be the “best” region of the set of corresponding regions 314, 324, and 334, region 336 is determined to be the “best” region of the set of corresponding regions 316, 326, and 336, and region 338 is determined to be the “best” region of the set of corresponding regions 318, 328, and 338. In one or more embodiments, image 306 has the largest number of regions that are determined to be the “best” region, and thus image 306 is selected by image combining module 126 as the base image.

Region 320 is determined to be the “best” region of regions 310, 320, and 330, so image combining module 126 automatically replaces region 330 with region 320 in the combined image. Similarly, region 314 is determined to be the “best” region of regions 314, 324, and 334, so image combining module 126 automatically replaces region 334 with region 314. Region 332 is determined to be the “best” region of regions 312, 322, and 332, so region 332 is kept in the combined image. Similarly, regions 336 and 338 are kept in the combined image as they are determined to be the “best” regions relative to the corresponding regions of the other images.

It should be noted that an image can include areas or portions that are not identified as a region. For example, image 306 includes areas that are not part of regions 330, 332, 334, 336, and 338. For such areas that are not identified as a region, image combining module 126 keeps those areas from the base image and does not replace those areas with areas from another image. Alternatively, such areas can be treated as an additional one or more regions, with evaluation module 124 determining how good such areas are perceived as being and image combining module 126 automatically replacing the area in the base image with the corresponding area of another image based on the determinations.

FIG. 4 shows an example of a combined image in accordance with one or more embodiments. FIG. 4 illustrates image 400, which is a combined image generated (e.g., by image combining module 126 of FIG. 1) from the images 302, 304, and 306 of FIG. 3. Image 400 includes regions from the base image, as well as corresponding regions from the other images that have replaced regions in the base image. Following with the discussion of FIG. 3, image 400 includes regions 320, 332, 314, 336, and 338.

Returning to FIG. 1, image combining module 126 automatically selects, based on the determinations of how good the regions are perceived as being by evaluation module 124, regions to be included in the combined image as discussed above. In addition, in one or more embodiments image combining module 126 and UI module 106 allow a user to provide input regarding which one of multiple corresponding regions is to be selected for inclusion in combined image 112. This user input can override the automatic selections made by image combining module 126, or alternatively can be input at different times (e.g., prior to the automatic selection being made by image combining module 126).

UI module 106 generates, manages, and/or outputs a user interface for display. This user interface allows a user to provide input regarding which one of multiple corresponding regions is to be selected. The user interface can be displayed on a screen of the device implementing user interface module 106, or alternatively one or more signals can be generated that are output to one or more other display devices that include a screen on which the user interface can be displayed. A screen can be implemented in a variety of different manners, such as using liquid crystal display (LCD) technology, plasma screen technology, image projection technology, and so forth.

UI module 106 also receives user inputs from a user (e.g., a user of the device implementing UI module 106). User inputs can be provided in a variety of different manners, such as by pressing a particular portion of a touchpad or touchscreen, or by pressing one or more keys of a keypad or keyboard. Touchscreen functionality can be provided using a variety of different technologies. The user input can also be provided in other manners, such as via audible inputs, other physical feedback input to the device (e.g., tapping any portion of a device or another action that can be recognized by a motion detection component of the device, such as shaking the device, rotating the device, etc.), and so forth.

In one or more embodiments, UI module 106 generates a user interface that displays, for a particular region of an image, corresponding regions of each of the other images of the multiple images. These corresponding regions can be displayed in different manners, such as in a menu or window adjacent to the particular region, in a ribbon or other portion of a window, and so forth. The user can provide input via UI module 106 to select one of the other images, in response to which image combining module 126 replaces the region in the combined image with the corresponding region of the user-selected image.

FIG. 5 shows an example of a user interface via which a user can provide input regarding which of multiple corresponding regions is to be selected in accordance with one or more embodiments. FIG. 5 illustrates image 400, which is the same image 400 as illustrated in FIG. 4. In addition, FIG. 5 illustrates a window 500 adjacent to region 338. A user can request that window 500 be displayed by providing a variety of different user inputs (e.g., clicking a particular button of a mouse when a cursor is displayed on top of region 338, selecting a menu option, and so forth).

Window 500 displays regions from other images corresponding to a particular region. In the example illustrated in FIG. 5, window 500 includes regions 318 and 328, which are regions corresponding to region 338. The user can select (by providing an input via UI module 106 of FIG. 1) one of regions 318 and 328, in response to which image combining module 126 of FIG. 1 replaces region 338 with the user-selected one of regions 318 and 328. Thus, it can be seen that the other regions which could replace the automatically selected region 338 can be displayed to the user, and the user can select one of these other regions to replace the automatically selected region 338. The user can thus easily replace a particular region if he or she prefers a different region.

FIG. 6 is a flowchart illustrating an example process 600 for generating a combined image from multiple images in accordance with one or more embodiments. Process 600 is carried out by a device, such as a device implementing image generation module 102 of FIG. 1, and can be implemented in software, firmware, hardware, or combinations thereof. Process 600 is shown as a set of acts and is not limited to the order shown for performing the operations of the various acts. Process 600 is an example process for generating a combined image from multiple images; additional discussions of generating a combined image from multiple images are included herein with reference to different figures.

In process 600, multiple images are accessed (act 602). These multiple images can be received or obtained in a variety of different manners as discussed above.

The multiple images are aligned (act 604). As part of this aligning, corresponding regions of the multiple images are identified as discussed above. Additionally, if one or more of the multiple images cannot be aligned (e.g., due to their being images of a different scene), then those one or more images are deleted from the multiple images.

For each of multiple regions in each of the multiple images, a determination is made of how good the region is perceived as being (act 606). This determination can be made in a variety of different manners, such as by evaluating one or more of various characteristics of the region and/or based on a learning process as discussed above.

Based on the determinations made in act 606, a base image is identified (act 608). The base image can be identified in different manners, such as selecting the image having the largest image score, selecting the image having the largest number of regions that are perceived as being the “best” regions, and so forth as discussed above.

A combined image is generated by automatically replacing one or more regions in the base image with corresponding regions in other images that are perceived as being better (act 610). These regions that are perceived as better are corresponding regions having higher scores, having higher ranks, that were determined by a learning process as being the “best”, and so forth. As discussed above, the resultant combined image includes, for each region in the base image, the one of the corresponding regions that is perceived as being the “best” region. The one of the corresponding regions that is perceived as being the “best” region can be a region in which an object that is present in one or more corresponding regions of other images of the multiple images is absent as discussed above.

The combined image generated in act 610 is output (act 612). The combined image can be output in a variety of different manners, such as displaying the combined image, storing the combined image in a particular location (e.g., in a file in nonvolatile memory), communicating the combined image to another component or module of the device implementing process 600 (or alternatively of another device), and so forth.

FIG. 7 is a flowchart illustrating an example process 700 for selecting regions to include in the combined image based on user input in accordance with one or more embodiments. Process 700 is carried out by a device, such as a device implementing image generation module 102 of FIG. 1, and can be implemented in software, firmware, hardware, or combinations thereof. Process 700 is shown as a set of acts and is not limited to the order shown for performing the operations of the various acts. Process 700 is an example process for selecting regions to include in the combined image based on user input; additional discussions of selecting regions to include in the combined image based on user input are included herein with reference to different figures.

In process 700, for a particular region of an image, corresponding regions from other images are displayed (act 702). These corresponding regions can be displayed in different manners, such as in a window or menu adjacent to the particular region. The image that includes the particular region can be one of different images, such as a base image from which a combined image is being generated, a combined image after regions from different images have been automatically selected for inclusion in the combined image, and so forth.

A user selection of one of the corresponding regions is received (act 704). This user selection can be received in response to a variety of different user inputs as discussed above.

In response to the user selection in act 704, the particular region of the image is replaced with the user-selected region (act 706). Thus, for example, an automatically selected region can be overridden by a user, and the user-selected region included in the combined image rather than the automatically selected region.

Process 700 can be repeated for multiple different regions of the image.

Additionally, in one or more embodiments the generating a combined image from multiple images techniques discussed herein can be used during an image capturing process. In such embodiments, in addition to generating a combined image, a check is made to ensure that at least one of the corresponding regions is perceived as being good enough. This check can be performed in different manners. For example, a check can be made for whether, for each region in an image, that region or a corresponding region of another image has a score that exceeds a threshold value. This threshold value can be determined in various manners, such as empirically, based on the preferences of the administrator or designer, and so forth. Images continue to be captured until at least one of the corresponding regions is perceived as being good enough.

For example, a digital camera can have a “group shot” feature that can be activated by pressing a particular button, selecting a particular menu option, and so forth. In response to a user request (e.g., pressing a shutter release button) to take a picture with the group shot feature activated, the digital camera begins capturing and analyzing images. The digital camera includes an image generation module (e.g., a module 102 of FIG. 1) that identifies regions in the multiple images and determines how good those regions are perceived as being as discussed above. The digital camera continues to capture images until, for each set of corresponding regions in different images, at least one of the corresponding regions in the different images is perceived as being good enough (e.g., exceeds a threshold value). The digital camera can then cease capturing images because a combined image in which each automatically selected region is perceived as being good enough (e.g., exceeds a threshold value) can be generated. The digital camera can optionally provide feedback, such as a flashing light or audible tone, indicating that the digital camera has ceased capturing images.
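
A loop of that shape might look like the sketch below; capture_frame and score_region_sets stand in for the camera's capture call and the evaluation module, and the threshold and frame cap are invented safeguards, not values from the text.

```python
def group_shot(capture_frame, score_region_sets, threshold=0.7, max_images=20):
    # Capture until every set of corresponding regions has at least one
    # region perceived as good enough, or a frame cap is reached.
    images = []
    while len(images) < max_images:
        images.append(capture_frame())
        region_sets = score_region_sets(images)  # per-set lists of region scores
        if region_sets and all(max(scores) > threshold for scores in region_sets):
            break  # a good-enough combined image can now be generated
    return images
```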

Alternatively, the digital camera can cease capturing images in response to other events, such as a threshold number of images having been captured, images having been captured for a particular amount of time, one or more users previously detected in the scene being captured no longer being detected, and so forth.

FIG. 8 illustrates an example computing device 800 that can be configured to implement the generating a combined image from multiple images in accordance with one or more embodiments. One or more computing devices 800 can be used to implement, for example, system 100 of FIG. 1.

Computing device 800 includes one or more processors or processing units 802, one or more computer readable media 804 which can include one or more memory and/or storage components 806, one or more input/output (I/O) devices 808, and a bus 810 that allows the various components and devices to communicate with one another. Computer readable media 804 and/or one or more I/O devices 808 can be included as part of, or alternatively may be coupled to, computing device 800. Bus 810 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor or local bus, and so forth using a variety of different bus architectures. Bus 810 can include wired and/or wireless buses.

Memory/storage component 806 represents one or more computer storage media. Component 806 can include volatile media (such as random access memory (RAM)) and/or nonvolatile media (such as read only memory (ROM), Flash memory, optical disks, magnetic disks, and so forth). Component 806 can include fixed media (e.g., RAM, ROM, a fixed hard drive, etc.) as well as removable media (e.g., a Flash memory drive, a removable hard drive, an optical disk, and so forth).

The techniques discussed herein can be implemented in software, with instructions being executed by one or more processing units 802. It is to be appreciated that different instructions can be stored in different components of computing device 800, such as in a processing unit 802, in various cache memories of a processing unit 802, in other cache memories of device 800 (not shown), on other computer readable media, and so forth. Additionally, it is to be appreciated that the location where instructions are stored in computing device 800 can change over time.

One or more input/output devices 808 allow a user to enter commands and information to computing device 800, and also allow information to be presented to the user and/or other components or devices. Examples of input devices include a keyboard, a cursor control device (e.g., a mouse), a microphone, a scanner, and so forth. Examples of output devices include a display device (e.g., a monitor or projector), speakers, a printer, a network card, and so forth.

Various techniques may be described herein in the general context of software or program modules. Generally, software includes routines, programs, objects, components, data structures, and so forth that perform particular tasks or implement particular abstract data types. An implementation of these modules and techniques may be stored on or transmitted across some form of computer readable media. Computer readable media can be any available medium or media that can be accessed by a computing device. By way of example, and not limitation, computer readable media may comprise “computer storage media” and “communications media.”

“Computer storage media” include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.

“Communication media” typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism. Communication media also include any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above are also included within the scope of computer readable media.

Generally, any of the functions or techniques described herein can be implemented using software, firmware, hardware (e.g., fixed logic circuitry), manual processing, or a combination of these implementations. The terms “module” and “component” as used herein generally represent software, firmware, hardware, or combinations thereof. In the case of a software implementation, the module or component represents program code that performs specified tasks when executed on a processor (e.g., CPU or CPUs). The program code can be stored in one or more computer readable memory devices, further description of which may be found with reference to FIG. 8. The features of the generating a combined image from multiple images techniques described herein are platform-independent, meaning that the techniques can be implemented on a variety of commercial computing platforms having a variety of processors.
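By way of illustration only, and not as a definition of the described techniques, such a module might be sketched in Python roughly as follows. The region-list representation of an image and the `score_region` helper are assumptions made for this sketch; any per-region quality heuristic (e.g., detecting open eyes or smiles in a face region) could stand behind it.

```python
# Hypothetical sketch: score regions, pick the base image, then swap in
# better corresponding regions from the other images.
from typing import Callable, List, Sequence

Region = object          # placeholder for the pixel data of one region
Image = List[Region]     # an image is its regions, in a fixed order

def generate_combined_image(
    images: Sequence[Image],
    score_region: Callable[[Region], float],
) -> Image:
    # Score every region of every image.
    scores = [[score_region(r) for r in img] for img in images]

    # Base image: the one with the largest number of "best" regions.
    num_regions = len(images[0])
    best_counts = [0] * len(images)
    for r in range(num_regions):
        best_img = max(range(len(images)), key=lambda i: scores[i][r])
        best_counts[best_img] += 1
    base = best_counts.index(max(best_counts))

    # Replace each base-image region with the corresponding region of
    # another image whenever that region scored better.
    combined = list(images[base])
    for r in range(num_regions):
        best_img = max(range(len(images)), key=lambda i: scores[i][r])
        if scores[best_img][r] > scores[base][r]:
            combined[r] = images[best_img][r]
    return combined
```

In such a sketch, `score_region` could weight characteristics such as detected smiles, open eyes, or the absence of an object not identified as a known object, consistent with the determinations discussed above.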

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

What is claimed is:
1. A method comprising: accessing multiple images each including multiple objects; making a determination, for each of multiple regions in the multiple images, of how good the region is perceived as being by a device; and generating a combined image from the multiple images based on the determinations of how good corresponding ones of the multiple regions in the multiple images are perceived as being, the generating including automatically selecting for inclusion in the combined image a region from one of the multiple images in which an object that is present in one or more corresponding regions of other images of the multiple images is absent, the generating further including automatically selecting the region in which the object is absent if the object that is present in the one or more corresponding regions of other images is not identified, based on information in an object database, as a known object.
2. A method as recited in claim 1, wherein making a determination, for each of the multiple regions, of how good the region is perceived as being comprises determining, for each of the multiple regions, a score or rank associated with the region based on one or more characteristics of the region.
3. A method as recited in claim 1, further comprising: displaying, for a first region of the combined image, corresponding regions of one or more other images of the multiple images; receiving a user selection of one of the corresponding regions of the one or more other images; and replacing the first region of the combined image with the user-selected one of the corresponding regions of the one or more other images.
4. A method as recited in claim 3, wherein displaying the corresponding regions comprises displaying the corresponding regions in a window adjacent to the first region.
5. A method as recited in claim 1, wherein each of the multiple objects comprises a face.
6. A method as recited in claim 1, wherein making the determination of how good the region is perceived as being comprises determining whether an object that has been identified by a user is included in the region, and making the determination for the region based on whether an object that has been identified by the user is included in the region.
7. A method as recited in claim 1, wherein the multiple objects comprise multiple faces, and wherein making the determination of how good the region is perceived as being comprises determining whether eyes are detected as being open in a face in the region, and making the determination for the region based on whether eyes are detected as being open or including a catchlight in the face in the region.
8. A method as recited in claim 1, wherein the multiple objects comprise multiple faces, and wherein making the determination of how good the region is perceived as being comprises determining whether a smile is detected as being present in a face in the region, and making the determination for the region based on whether a smile is detected as being present in the face in the region.
9. A method as recited in claim 1, wherein the device comprises a digital camera that captures the multiple images, the method further comprising continuing to capture images to include in the multiple images until, for each of the multiple regions in an image, the region or a corresponding region of another of the multiple images has a score that exceeds a threshold value.
10. A method comprising: accessing multiple images each including multiple objects; making a determination, for each of multiple regions in the multiple images, of how good the region is perceived as being by a device; and generating a combined image from the multiple images based on the determinations of how good corresponding ones of the multiple regions in the multiple images are perceived as being, the generating including automatically selecting for inclusion in the combined image a region from one of the multiple images in which an object that is present in one or more corresponding regions of other images of the multiple images is absent, identifying a first image of the multiple images to be a base image, and generating the combined image from the multiple images by automatically replacing each region of the base image with a corresponding region of a second image of the multiple images if the corresponding region of the second image is perceived as being better than the region of the first image.
11. A method as recited in claim 10, wherein the corresponding region of the second image is perceived as being better than the region of the first image if the object is absent from the corresponding region of the second image.
12. A method as recited in claim 10, wherein identifying the first image to be the base image comprises: determining, for each of the multiple images, how many regions in the image have been determined as being the best; and selecting, as the base image, the one of the multiple images having a largest number of regions determined as being the best.
13. A method as recited in claim 10, wherein making a determination, for each of the multiple regions, of how good the region is perceived as being comprises determining, for each of the multiple regions, a score or rank associated with the region based on one or more characteristics of the region.
14. A method as recited in claim 10, wherein the device comprises a digital camera that captures the multiple images, the method further comprising continuing to capture images to include in the multiple images until, for each of the multiple regions in an image, the region or a corresponding region of another of the multiple images has a score that exceeds a threshold value.
15. A method as recited in claim 10, wherein each of the multiple objects comprises a face.
16. One or more computer storage media having stored thereon multiple instructions comprising instructions that, responsive to execution by one or more processors of a computing device, cause the one or more processors to: access multiple images each including multiple objects; make a determination, for each of multiple regions in the multiple images, of how good the region is perceived by the computing device as being; identify a base image of the multiple images; generate a combined image from the multiple images by automatically replacing each region of the base image with a corresponding region of another image of the multiple images if the corresponding region has been determined as being better than the region of the base image, wherein the corresponding region has been determined as being better than the region of the base image if an object that is included in the region of the base image is absent from the corresponding region; display regions of each of the other images of the multiple images, each of the displayed regions corresponding to a particular region of the base image; receive a user selection of one of the corresponding regions of the other images of the multiple images; and replace the particular region of the base image with the user-selected one of the corresponding regions of the other images.
17. One or more computer storage media as recited in claim 16, wherein to make a determination, for each of the multiple regions, of how good the region is perceived by the computing device as being is to make a determination, for each of the multiple regions, of a score or rank for the region based on one or more characteristics of the region.
18. One or more computer storage media as recited in claim 16, wherein to display the regions of each of the other images is to display the regions of each of the other images in a window adjacent to the particular region.
19. One or more computer storage media as recited in claim 16, wherein each of the multiple objects comprises a face.
20. One or more computer storage media as recited in claim 16, wherein to identify the base image of the multiple images is to: determine, for each of the multiple images, how many regions in the image have been determined as being the best; and select, as the base image, the one of the multiple images having a largest number of regions determined as being the best.
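Purely as an illustrative aside, and not as part of the claims themselves, the capture loop recited in claims 9 and 14 might be sketched as follows. The helpers `capture_image` and `score_region`, the `threshold` parameter, and the `max_shots` guard are all assumptions made for this sketch.

```python
# Hypothetical sketch of the capture loop of claims 9 and 14: keep
# capturing images until every region position has, in some captured
# image, a score exceeding a threshold value.
from typing import Callable, List

def capture_until_good(
    capture_image: Callable[[], List[object]],  # returns an image's regions
    score_region: Callable[[object], float],
    threshold: float,
    max_shots: int = 20,  # guard against an unbounded capture loop
) -> List[List[object]]:
    images: List[List[object]] = []
    best_scores: List[float] = []
    for _ in range(max_shots):
        regions = capture_image()
        images.append(regions)
        scores = [score_region(r) for r in regions]
        # Track the best score seen so far at each region position.
        if not best_scores:
            best_scores = scores
        else:
            best_scores = [max(a, b) for a, b in zip(best_scores, scores)]
        if all(s > threshold for s in best_scores):
            break  # every region position now has a good-enough version
    return images
```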